Abstract

The small-world network model is a simple model of the structure of social
networks, which simultaneously possesses characteristics of both regular lattices
and random graphs. The model consists of a one-dimensional lattice
with a low density of shortcuts added between randomly selected pairs of
points. These shortcuts greatly reduce the typical path length between any
two points on the lattice. We present a mean-field solution for the average
path length and for the distribution of path lengths in the model. This solution
is exact in the limit of large system size and either large or small number
of shortcuts.

Social networks, such as networks of friends, have two characteristics which one might
imagine were contradictory. First, they show "clustering," meaning that two of your friends
are far more likely also to be friends of one another than two people chosen from the
population at random. Second, they exhibit what has become known as the "small-world effect,"
namely that any two people can establish contact by going through only a short chain of
intermediate acquaintances. Following the work of Milgram, it is widely touted that the
average number of such intermediates is about six: there are "six degrees of separation"
between two randomly chosen people in the world. In fact this number is probably not a
very accurate estimate, but the basic principle is sound.

These two properties appear contradictory because the first is a typical property of
low-dimensional lattices but not of random graphs or other high-dimensional lattices, while the
second is typical of random graphs, but not of low-dimensional lattices. Recently, Watts and
Strogatz have proposed a simple model of social networks which interpolates between
low-dimensional lattices and random graphs and displays both the clustering and small-world
properties. In this model, $L$ sites are placed on a regular one-dimensional lattice with
nearest- and next-nearest-neighbor connections out to some constant range $k$ and periodic
boundary conditions (the lattice is a ring). A number of shortcuts are then added between
randomly chosen pairs of sites with probability $\phi$ per connection on the underlying lattice
(of which there are $Lk$). Thus there are on average $Lk\phi$ shortcuts in the graph. An example
of a small-world graph with $L = 24$, $k = 3$, and four shortcuts is shown in Fig. 1a. Watts
and Strogatz examined numerically the average distance between pairs of vertices on
small-world graphs and found that only a small density of shortcuts is needed to produce distances
comparable to those seen in true random graphs. At the same time, the model shows the
clustering which is characteristic of real social networks.
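
The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the numerical results quoted in this paper: the function names `small_world_graph` and `mean_distance_from` are hypothetical, shortcuts are assumed to be added on top of the lattice (none of the lattice connections is removed), and self-loops or duplicate shortcuts are simply tolerated. It assumes $L > 2k$ so that the ring has exactly $Lk$ distinct connections.

```python
import random
from collections import deque


def small_world_graph(L, k, phi, rng):
    """Ring of L sites joined to neighbours up to range k, plus random shortcuts.

    Returns the adjacency sets, the number of lattice connections (L*k when
    L > 2k) and the number of shortcuts that were added.
    """
    adj = [set() for _ in range(L)]
    n_lattice = 0
    for i in range(L):
        for d in range(1, k + 1):
            j = (i + d) % L
            adj[i].add(j)
            adj[j].add(i)
            n_lattice += 1
    n_shortcuts = 0
    # One shortcut, between two uniformly chosen sites, with probability phi
    # per lattice connection (assumption: self-loops and repeats are allowed).
    for _ in range(n_lattice):
        if rng.random() < phi:
            a, b = rng.randrange(L), rng.randrange(L)
            adj[a].add(b)
            adj[b].add(a)
            n_shortcuts += 1
    return adj, n_lattice, n_shortcuts


def mean_distance_from(adj, source):
    """Average graph distance from `source` to every other site (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values()) / (len(adj) - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    adj, n_lattice, n_shortcuts = small_world_graph(L=1000, k=1, phi=0.01, rng=rng)
    print(n_lattice, n_shortcuts, mean_distance_from(adj, 0))
```

Averaging `mean_distance_from` over many sources and many realizations would give an estimate of the average path length discussed below.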

In this paper we derive an analytic solution for the distribution of path lengths in the 
small-world model. To do this we make use of a mean-field approximation in which the 
distributions of quantities over the randomness are represented by the average values of 
those quantities. However, as we will show, the mean-field theory turns out to be exact in
the limit of large system size for the small-world model.

Figure 1 (a) A small-world graph of 24 sites with k = 3 and four shortcuts.
(b) The continuum version of the same graph. The bold lines denote the portion of
the graph which is within distance r of the point at the top denoted by the arrow.
In this case there are four filled segments, or "clusters", around the perimeter of the
graph, or equivalently four gaps between clusters.

The approach we use is first to solve the continuum version of the Watts-Strogatz model
shown in Fig. 1b. In this version of the model the underlying one-dimensional lattice is
treated as a continuum, and one can measure the distance between any two points on this 
continuum, rather than between only a discrete set of lattice sites. Shortcuts are assumed 
to have length zero. Once we have a solution for the continuum model, we then note that if 
the density of shortcuts is low the discrete and continuum models are equivalent, and hence 
our solution is also a solution of the discrete small-world model. 

Consider then a "neighborhood" of radius $r$ centered around a randomly chosen point
on a small-world network of $L$ sites, where by neighborhood we mean the set of points
which can be reached by following paths of length $r$ or less on the graph. Let $m(r)$ be
the number of sites on the graph which do not belong to this neighborhood, averaged over
many realizations of the randomness in the graph, and $n(r)$ be the average number of "gaps"
around the lattice amongst which those sites are divided (see Fig. 1b). Equivalently, $n(r)$
can be viewed as the number of "clusters" of occupied sites. In the continuum model both
$m$ and $n$ are real numbers. We will also find it convenient to use the rescaled variables

$$\mu(r) = \frac{m(r)}{L}, \qquad \nu(r) = \frac{n(r)}{L}. \tag{1}$$

In the continuum limit the quantities $m(r)$ and $n(r)$ satisfy differential equations as
follows. The rate at which the number of empty sites on the lattice decreases with increasing
$r$ is equal to the number $2n$ of growing edges of clusters on the lattice times the range $k$ of
connections on the lattice. Thus

$$\frac{dm}{dr} = -2kn, \tag{2}$$

or

$$\frac{d\mu}{dr} = -2k\nu. \tag{3}$$

This equation is exact for all values of $L$ and $\phi$.

The rate at which the number of gaps $n$ changes has two contributions. First, the number
of gaps increases as a result of the shortcuts on the graph. If $\xi = 1/(k\phi)$ is the characteristic
length, defined in previous work such that $L/\xi$ is the average number of shortcuts in the graph, then
the density of the ends of shortcuts on the lattice is $2/\xi$. This means that as $r$ increases, new
shortcuts are encountered at a rate $4kn/\xi$. For each shortcut encountered, a new cluster will
be started at a random position on the lattice, provided that the other end of the shortcut
in question falls in one of the gaps around the ring. The probability of this happening is
$m/L$. Thus the rate at which clusters (or gaps) are created is $4kmn/(\xi L)$.

The number of gaps decreases when the edges of a gap meet one another. This will
happen in the interval from $r$ to $r + \delta r$ if the size of one of the gaps is less than $2k\,\delta r$. If
we consider all possible ways of distributing the $m$ empty sites over $n$ gaps, we can see that
the probability distribution of the sizes of the gaps is the same as the distribution of the
smallest of $n - 1$ uniformly distributed random numbers $x$ between $0$ and $m$, which is

$$p(x) = \frac{n-1}{m}\left(1 - \frac{x}{m}\right)^{n-2}. \tag{4}$$

Thus the probability of one particular gap being smaller than $2k\,\delta r$ is $1 - (1 - 2k\,\delta r/m)^{n-1}$,
which tends to $2k(n-1)\,\delta r/m$ in the limit of small $\delta r$. The probability that any one of them
is smaller than $2k\,\delta r$ is $n$ times this. Thus our final equation for the rate of change of $n$ is

$$\frac{dn}{dr} = \frac{4kmn}{\xi L} - \frac{2kn(n-1)}{m}, \tag{5}$$

or

$$\frac{d\nu}{dr} = \frac{4k\mu\nu}{\xi} - \frac{2k\nu(\nu - 1/L)}{\mu}. \tag{6}$$

This equation is only exact when the average values $\mu(r)$ and $\nu(r)$ accurately represent the
actual values of these quantities in the particular realization of the model we are looking
at, i.e., when the distribution of values is sharply peaked. This will be the case when the
number of shortcuts on the lattice is either much less than one ($L \ll \xi$) or much greater
than one ($L \gg \xi$), and therefore also in the limit of large system size. We have confirmed
this using numerical simulations, which show the distributions of $\mu$ and $\nu$ to be sharply
peaked in these limits but broad elsewhere.
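
The gap-size argument that enters the loss term of the equation for $n$ can be checked with a short Monte Carlo experiment. The sketch below is an illustration only: it assumes that "distributing the $m$ empty sites over $n$ gaps" can be modelled in the continuum by cutting a ring of circumference $m$ at $n$ independent uniform points, and the function names are hypothetical. It compares the measured probability that one particular gap is shorter than $s$ with the prediction $1 - (1 - s/m)^{n-1}$, where $s$ plays the role of $2k\,\delta r$.

```python
import random


def first_gap(m, n, rng):
    """Cut a ring of circumference m at n uniform points (n >= 2) and return
    the length of the gap that follows the first cut point."""
    cuts = [rng.uniform(0.0, m) for _ in range(n)]
    start = cuts[0]
    # Distance around the ring from the first cut to the nearest other cut.
    return min((c - start) % m for c in cuts[1:])


def prob_gap_below(s, m, n, trials, rng):
    """Monte Carlo estimate of the probability that a particular gap is shorter than s."""
    hits = sum(first_gap(m, n, rng) < s for _ in range(trials))
    return hits / trials


def predicted(s, m, n):
    """Probability that the smallest of n - 1 uniform numbers on [0, m] is below s."""
    return 1.0 - (1.0 - s / m) ** (n - 1)


if __name__ == "__main__":
    rng = random.Random(2024)
    print(prob_gap_below(10.0, 100.0, 5, 20000, rng), predicted(10.0, 100.0, 5))
```

For small $s$ the prediction reduces to $(n-1)s/m$, which is the linear form used in the rate equation.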

Between them, Eqs. (3) and (6) are the fundamental equations which lead to our solution
for the small-world model. As demonstrated in the appendix, these equations can also be
derived by writing down difference equations for the variables $m$ and $n$ in the discrete
small-world model and then expanding in powers of the shortcut density $\phi = 1/(k\xi)$ and keeping
only the leading order terms.

We solve Eqs. (3) and (6) as follows. First, we take their ratio, which eliminates the
variable $r$ and gives us a single differential equation directly relating $\mu$ and $\nu$ thus:

$$\frac{d\nu}{d\mu} = -\frac{2\mu}{\xi} + \frac{\nu - 1/L}{\mu}. \tag{7}$$

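The general solution of this equation is

$$\nu = -\frac{2\mu^2}{\xi} + \frac{1}{L} + C\mu, \tag{8}$$

where $C$ is an integration constant. The constant can be fixed using the boundary conditions
$\mu(0) = 1$, $\nu(0) = 1/L$, which imply that $C = 2/\xi$ and hence

$$\nu = \frac{2}{\xi}\left(\mu - \mu^2\right) + \frac{1}{L}. \tag{9}$$

Substituting this solution back into Eq. (3), we get

$$\frac{d\mu}{dr} = \frac{4k}{\xi}\left(\mu^2 - \mu\right) - \frac{2k}{L}. \tag{10}$$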

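If we neglect the constant term in this equation, we arrive at the normal logistic growth
equation, which will give an accurate solution for $\mu$ in the regime where the lattice is
neither very full nor very empty. If we keep all the terms, the general solution given the
boundary conditions is

$$\frac{4k}{\xi}\,r = \int_1^{\mu} \frac{dz}{z^2 - z - \xi/2L}
  = \frac{2}{\sqrt{1 + 2\xi/L}} \left[ \tanh^{-1}\frac{1}{\sqrt{1 + 2\xi/L}}
  - \tanh^{-1}\frac{2\mu - 1}{\sqrt{1 + 2\xi/L}} \right]. \tag{11}$$

Rearranging for $\mu$ this gives

$$\mu = \frac{1}{2}\left[ 1 + \sqrt{1 + 2\xi/L}\,
  \tanh\!\left( \tanh^{-1}\frac{1}{\sqrt{1 + 2\xi/L}}
  - 2\sqrt{1 + 2\xi/L}\,\frac{kr}{\xi} \right) \right]. \tag{12}$$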

This equation gives us $\mu$ in terms of $r$, $\xi$ and $L$ for the continuum version of the small-world
model. In the case where the typical lattice distance between the ends of shortcuts is much
larger than one ($\xi \gg 1$) the continuum version becomes equivalent to the normal discrete
version of the model and so in this limit our solution is also a solution of the discrete
small-world model. Combining this condition with the conditions specified earlier, we see that our
solution will be exact when either $1 \ll L \ll \xi$, or when $1 \ll \xi \ll L$. This latter regime is
precisely the regime in which the small-world model is physically interesting: the regime of
large system size and large number of sparsely distributed shortcuts. In the intermediate
regime between the two conditions given, the solution is still quite accurate, and gives a
good guide to the general behavior of the model, as we will shortly show.
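
As a numerical cross-check of the closed-form solution, one can integrate the differential equation $d\mu/dr = (4k/\xi)(\mu^2 - \mu) - 2k/L$ directly from $\mu(0) = 1$ and compare with the tanh expression for $\mu(r)$. The sketch below does this with a classical fourth-order Runge-Kutta step; the function names and the parameter values in the demonstration are illustrative choices, not values taken from the paper, and the comparison is only meaningful for $r$ small enough that $\mu$ is still positive.

```python
import math


def mu_closed_form(r, k, xi, L):
    """Closed-form fraction of sites outside a neighbourhood of radius r."""
    s = math.sqrt(1.0 + 2.0 * xi / L)
    return 0.5 * (1.0 + s * math.tanh(math.atanh(1.0 / s) - 2.0 * s * k * r / xi))


def dmu_dr(mu, k, xi, L):
    """Right-hand side of the differential equation for mu."""
    return (4.0 * k / xi) * (mu * mu - mu) - 2.0 * k / L


def mu_rk4(r_end, k, xi, L, steps=2000):
    """Integrate the differential equation from mu(0) = 1 up to r_end with RK4."""
    h = r_end / steps
    mu = 1.0
    for _ in range(steps):
        k1 = dmu_dr(mu, k, xi, L)
        k2 = dmu_dr(mu + 0.5 * h * k1, k, xi, L)
        k3 = dmu_dr(mu + 0.5 * h * k2, k, xi, L)
        k4 = dmu_dr(mu + h * k3, k, xi, L)
        mu += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return mu


if __name__ == "__main__":
    # Illustrative parameters: k = 1, xi = 100, L = 10000 (about 100 shortcuts).
    for r in (50.0, 150.0, 200.0):
        print(r, mu_closed_form(r, 1, 100.0, 10000.0), mu_rk4(r, 1, 100.0, 10000.0))
```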

We now derive some of the more important consequences of Eq. (12). First, we check
that it reduces to the correct expression in the case $L \to \infty$. Making use of the identity

$$\tanh(x_1 + x_2) = \frac{\tanh x_1 + \tanh x_2}{1 + \tanh x_1 \tanh x_2}, \tag{13}$$

we find that to first order in $\xi/L$

$$\mu = 1 + \frac{\xi}{2L}\left[ 1 - e^{4kr/\xi} \right], \tag{14}$$

which agrees with the direct derivation for the $L = \infty$ case given previously.

Next, we note that once we have the fraction $\mu$ of sites not belonging to a neighborhood
of radius $r$, we can also calculate the number $A(r)\,dr$ of sites whose distance from the starting
point lies in an interval from $r$ to $r + dr$ (the "surface area" of the neighborhood) from

$$A = -L\frac{d\mu}{dr} = 2k + \frac{4kL}{\xi}\left(\mu - \mu^2\right). \tag{15}$$

Thus, once we have $\mu$ we can easily calculate $A$.

We can also derive an expression for the average vertex-vertex separation $\ell$ on the graph,
a quantity which has been studied by many authors. We write

$$\ell = \frac{1}{L}\int_0^\infty r A(r)\,dr = \int_0^1 r\,d\mu, \tag{16}$$

where we have made use of Eq. (15). Even before performing the integral, we can see that
this implies certain behavior on the part of $\ell$. Eq. (11) shows that $kr/\xi$ is a function only
of $\mu$ and of the ratio of $L$ and $\xi$. In other words $r$ has the form

$$r = \frac{\xi}{k}\,h(\mu, L/\xi), \tag{17}$$

where $h(x, y)$ is a universal scaling function with no dependence on the parameters of the
model other than through its arguments. Substituting this form into Eq. (16) and performing
the integral over $\mu$, we get

$$\ell = \frac{\xi}{k}\,g(L/\xi), \tag{18}$$

where $g(x)$ is another universal scaling function. Except for the leading factor of $1/k$, this
scaling form is identical to the one suggested previously by Barthélémy and Amaral.
Making the substitution $g(x) = x f(x)$, we can also write it in the form

$$\ell = \frac{L}{k}\,f(L/\xi) = \frac{L}{k}\,f(\phi k L), \tag{19}$$

a form which was proposed by Newman and Watts on the basis of renormalization group
arguments, and which has been confirmed by extensive numerical simulation.

The complete solution for $\ell$ is obtained by substituting (11) into (16) and performing
the integral, which gives

$$\ell = \frac{\xi}{2k\sqrt{1 + 2\xi/L}}\,\tanh^{-1}\frac{1}{\sqrt{1 + 2\xi/L}}. \tag{20}$$

The scaling function $f(x)$ is then given by

$$f(x) = \frac{1}{2\sqrt{x^2 + 2x}}\,\tanh^{-1}\sqrt{\frac{x}{x + 2}}. \tag{21}$$

In Fig. 2 we show this form for the scaling function along with numerical data from direct
measurements of the average path length on k = 1 discrete small-world graphs of size up to
L = 10'' sites. As the figure shows, the two are in good agreement for large and small values 
of the independent variable x but, as expected, there is some disagreement in the region 
around $x = 1$ where $\xi$ and $L$ are of the same order of magnitude.

The asymptotic forms of Eq. (21) are

$$f(x) = \begin{cases} \dfrac{1}{4} & \text{for } x \ll 1, \\[1ex]
  \dfrac{\log 2x}{4x} & \text{for } x \gg 1, \end{cases} \tag{22}$$

where we have made use of the identity

$$\tanh^{-1} x = \frac{1}{2}\log\frac{1 + x}{1 - x}. \tag{23}$$

These forms are in agreement with previous conjectures, which suggested that $f(x)$
should have the value $\frac{1}{4}$ for small $x$ and should go as $(\log x)/x$ for large $x$. As we see, the
leading numerical factor in the latter case is $\frac{1}{4}$; this figure is exact, since Eq. (21) is exact
for large $L/\xi$.

Figure 2 The average path length as a fraction of system size on a $k = 1$ small-world
graph, plotted against the average number $L/\xi$ of shortcuts. The circles are
numerical measurements for the discrete model and the solid line is the analytic
solution for the continuum model, Eq. (21). The error bars on the numerical
measurements are smaller than the points. Inset: the average path length on small-world
graphs with $L = 10^6$, for values of $\phi$ from 0.01 up to 1 (circles) and the analytic
solution, Eq. (20) (dotted line).

In passing, we note that there is a simple physical interpretation of the scaling function
$f(x)$: apart from the leading factor of $\frac{1}{4}$, it is the fraction by which the average path length
on a small-world graph is reduced if the graph has $x$ shortcuts. For example, Eq. (21)
indicates that it takes $x = 3.5$ shortcuts on average to reduce the mean path length by a
half, and 44 shortcuts to reduce it by a factor of ten. Thus only a small number of shortcuts
are needed to reduce path lengths quite considerably. The same conclusion has been reached
by Watts and Strogatz on the basis of numerical data.
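
The scaling function and the numbers quoted above are easy to evaluate numerically. The following sketch assumes the closed form $f(x) = \tanh^{-1}\!\sqrt{x/(x+2)}\,/\,(2\sqrt{x^2+2x})$ and uses a simple bisection on a logarithmic scale (the helper name `shortcuts_for_reduction` is hypothetical) to find the number of shortcuts at which $4f(x)$, the path length relative to the shortcut-free ring, drops to a given fraction. It relies on $f$ decreasing monotonically with $x$.

```python
import math


def f(x):
    """Scaling function: average path length divided by L/k, for x = L/xi shortcuts."""
    return math.atanh(math.sqrt(x / (x + 2.0))) / (2.0 * math.sqrt(x * x + 2.0 * x))


def shortcuts_for_reduction(factor, lo=1e-6, hi=1e6, iters=200):
    """Number of shortcuts x at which 4 f(x) = 1/factor, found by bisection in log x."""
    target = 1.0 / factor
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if 4.0 * f(mid) > target:
            lo = mid  # path length still too long: more shortcuts are needed
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    print("small-x limit:", f(1e-6))
    print("large-x ratio to (log 2x)/4x:", f(1e6) / (math.log(2e6) / 4e6))
    print("halving:", shortcuts_for_reduction(2.0))
    print("factor of ten:", shortcuts_for_reduction(10.0))
```

The last two values should come out close to the 3.5 and 44 shortcuts quoted in the text.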

In the inset of Fig. 2 we show how our solution fails when the shortcut density becomes
too high. The figure shows numerical results for $\ell$ for a variety of values of the shortcut
density $\phi = 1/\xi$ from 0.01 up to 1, for systems of one million sites (circles). The dotted line
is Eq. (20). As the figure shows, the analytic solution is a reasonable guide to the behavior
of $\ell$ up to quite large values of $\phi$ but, as expected, fails when $\phi$ gets close to 1.

To conclude, we have given a mean-field-like analytic solution for the distribution of path
lengths in the continuum version of the Watts-Strogatz small-world model. This solution is
exact in the limit of large system sizes for a given density of shortcuts, or in the limit of low
shortcut density for given system size. In the case where the shortcut density is low but the
total number of shortcuts on the lattice is large (because the lattice itself is also large) our
solution is also an exact solution of the normal discrete small-world model. We have also
derived an expression for the average path length in the model and from this extracted the
scaling forms which this path length obeys. We have checked our results against numerical
simulation of the discrete small-world model and find good agreement in the regions where
our solution is expected to be exact. In other regions the solution is a good guide to the
general behavior of the model but shows some deviation from the numerical results.

APPENDIX: THE CONTINUUM MODEL AS THE
SMALL-$\phi$ LIMIT OF THE DISCRETE MODEL

In this appendix, we rederive Equations (3) and (6) from the behavior of the discrete
version of the small-world model to leading order in the shortcut density $\phi$.

Consider a neighborhood of sites which are within distance r of a given starting site in 
the discrete model. By analogy with the continuous case, let m be the number of sites on the 
lattice which are not in this neighborhood and n be the number of "gaps" between clusters 
of occupied sites around the ring. In fact, in the spirit of our mean-field approximation, m
and n should be thought of as the averages of these quantities over all possible realizations of
the lattice. This means that they may have non-integer values. Here we treat them as integers
for combinatorial purposes, but our formulas are easily extended to non-integer values by
replacing the factorials by $\Gamma$-functions.

When we increase r by one, the value of m decreases for two reasons: first because the 
gaps between clusters shrink and second because of new sites which are reached by traveling 
down shortcuts encountered on the previous step. We write 

$$m' = m - \Delta m, \tag{A1}$$

where

$$\Delta m = \Delta m_g + \Delta m_s, \tag{A2}$$

with the two terms representing the shrinking of gaps and the shortcut contribution
respectively.

To calculate $\Delta m_g$, we note that the probability of any particular gap having size $j$ is

$$p_j = \binom{m-j-1}{n-2} \bigg/ \binom{m-1}{n-1}, \tag{A3}$$

and the average number of such gaps is $n$ times this. When we increase $r$ by one, gaps of
size $2k$ or larger shrink by $2k$, while gaps smaller than $2k$ vanish altogether. Thus

$$\Delta m_g = n\left[ \sum_{j=1}^{2k} j\,p_j + 2k\left( 1 - \sum_{j=1}^{2k} p_j \right) \right]
  = m - \frac{(m-2k)!\,(m-n)!}{(m-1)!\,(m-n-2k)!}. \tag{A4}$$
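
The combinatorial identity behind the closed form for $\Delta m_g$ can be verified exactly for small integer values. The sketch below assumes the gap-size probability $p_j = \binom{m-j-1}{n-2} / \binom{m-1}{n-1}$ (the number of ways of splitting the remaining $m - j$ sites into $n - 1$ non-empty gaps, divided by the number of ways of splitting $m$ sites into $n$ non-empty gaps), and compares the sum over gap sizes with the factorial expression using exact rational arithmetic. The function names are hypothetical, and the check requires $n \ge 2$ and $m - n \ge 2k$.

```python
from fractions import Fraction
from math import comb, factorial


def p_gap(j, m, n):
    """Probability that one particular gap has size j, for 1 <= j <= m - n + 1."""
    return Fraction(comb(m - j - 1, n - 2), comb(m - 1, n - 1))


def delta_m_gaps_sum(m, n, k):
    """Sites lost from gaps in one step: gaps of size j <= 2k lose j, larger gaps lose 2k."""
    small = range(1, 2 * k + 1)
    lost_from_small = sum(j * p_gap(j, m, n) for j in small)
    prob_large = 1 - sum(p_gap(j, m, n) for j in small)
    return n * (lost_from_small + 2 * k * prob_large)


def delta_m_gaps_closed(m, n, k):
    """Closed form m - (m-2k)! (m-n)! / ((m-1)! (m-n-2k)!)."""
    num = factorial(m - 2 * k) * factorial(m - n)
    den = factorial(m - 1) * factorial(m - n - 2 * k)
    return m - Fraction(num, den)


if __name__ == "__main__":
    for m, n, k in [(5, 2, 1), (30, 4, 2), (50, 7, 3)]:
        print(m, n, k, delta_m_gaps_sum(m, n, k), delta_m_gaps_closed(m, n, k))
```

When $(n-1)/m$ is small the closed form approaches $2kn$, which is the leading-order result used in the remainder of the appendix.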



To calculate the contribution $\Delta m_s$ from the number of shortcuts, we note that the
probability of encountering the end of a shortcut at any given site is $2/\xi = 2k\phi$, just as in
the continuum case, so the number of new shortcuts encountered when we increase $r$ by one
is $2k\phi\,\Delta m$. $\Delta m_s$ in fact depends on the number of shortcuts encountered on the previous
increase in $r$, so we need to write $2k\phi\,\Delta m^{(r-1)}$. Only those shortcuts which land us at one
of the $m - \Delta m_g$ unoccupied sites contribute to $\Delta m_s$, so

$$\Delta m_s = 2k\phi\,\Delta m^{(r-1)}\,\frac{m - \Delta m_g}{L}. \tag{A5}$$

Substituting Eqs. (A4) and (A5) into Eq. (A2) we get our complete expression for $\Delta m$.
Now we note that, except when the lattice is very nearly full, the number of unfilled sites
$m$ is of order $L$. The number of clusters of filled sites can be no greater than the number
of shortcuts on the lattice plus one for the initial cluster around the starting point. Thus
$n < \phi L + 1$, and the ratio $(n-1)/m$ is a quantity of order $\phi$. Expanding in powers of this
quantity and assuming the number of sites $m$ to be much greater than $2k$ then gives

$$\Delta m = 2kn, \tag{A6}$$

plus terms of order $\phi$. Physically, the reason for the simplicity of this expression is that, to
first order in $(n-1)/m$, most gaps have size $2k$ or larger, and the contribution from new
shortcuts to $\Delta m$ can be neglected, since most sites are connected only to their local neighbors.

The change in the value of $n$ also has two contributions. First, the number of gaps
increases when a shortcut creates a new cluster, and divides a gap into two new ones. As we
have already shown, this happens $\Delta m_s$ times on average when we increase $r$ by one, where
$\Delta m_s$ is given by Eq. (A5). Second, gaps disappear when their edges meet. When we increase
$r$ by one, a gap will close if its size is $2k$ or less. Thus the change in $n$ is

$$n' = n + \Delta n, \tag{A7}$$

where

$$\Delta n = \Delta m_s - n\sum_{j=1}^{2k} p_j
  = \Delta m_s - n\left[ 1 - \frac{(m-2k-1)!\,(m-n)!}{(m-1)!\,(m-n-2k)!} \right], \tag{A8}$$

with $p_j$ the gap-size probability of Eq. (A3) and $\Delta m_s$ again given by Eq. (A5). Expanding
to lowest order in $(n-1)/m$, taking $m \gg 2k$ again, and combining the result with Eq. (A6) gives

$$\Delta n = \frac{4k^2\phi\,mn}{L} - \frac{2kn(n-1)}{m}. \tag{A9}$$

Changing Eqs. (A6) and (A9) from difference equations to differential ones and dividing
by $L$ to transform from $m$ and $n$ to $\mu$ and $\nu$ gives

$$\frac{d\mu}{dr} = -2k\nu, \tag{A10}$$

$$\frac{d\nu}{dr} = 4k^2\phi\,\mu\nu - \frac{2k\nu(\nu - 1/L)}{\mu}. \tag{A11}$$

Recalling that $\xi = 1/(k\phi)$, we can see that these equations are identical to Equations (3)
and (6).
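
The passage from difference to differential equations can also be illustrated numerically. The sketch below iterates the leading-order difference equations $\Delta m = 2kn$ and $\Delta n = 4k^2\phi\,mn/L - 2kn(n-1)/m$ with unit steps in $r$, starting from $m = L$, $n = 1$ (an assumption chosen to match the continuum boundary conditions $\mu(0) = 1$, $\nu(0) = 1/L$), and compares the resulting $\mu = m/L$ with the continuum tanh solution. The function names and parameter values are illustrative only; agreement is expected to improve as $\phi$ becomes smaller, and a residual difference of order $\phi$ should remain.

```python
import math


def iterate_difference_equations(L, k, phi, steps):
    """Iterate the leading-order difference equations with unit steps in r.

    Returns the rescaled variables (mu, nu) = (m / L, n / L) after `steps` steps.
    """
    m, n = float(L), 1.0
    for _ in range(steps):
        dm = 2.0 * k * n
        dn = 4.0 * k * k * phi * m * n / L - 2.0 * k * n * (n - 1.0) / m
        m, n = m - dm, n + dn
    return m / L, n / L


def mu_continuum(r, k, phi, L):
    """Continuum solution for mu(r), with xi = 1 / (k * phi)."""
    xi = 1.0 / (k * phi)
    s = math.sqrt(1.0 + 2.0 * xi / L)
    return 0.5 * (1.0 + s * math.tanh(math.atanh(1.0 / s) - 2.0 * s * k * r / xi))


if __name__ == "__main__":
    L, k, phi = 10**6, 1, 0.001
    for r in (500, 1000, 1500, 1900):
        mu, _ = iterate_difference_equations(L, k, phi, r)
        print(r, mu, mu_continuum(r, k, phi, L))
```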